Abstract
Introduction: Relapsed pediatric acute lymphoblastic leukemia (ALL) requires intensive treatment, increasing the risk of late effects and toxicity from multi-modal therapy. Improved relapse prediction is critical for optimizing therapy and reducing late effects. We aimed to find a protein signature in diagnostic blood or bone marrow samples that distinguishes patients who will relapse (R) from those who will not (NR).
Methods: Pediatric ALL samples (bone marrow or peripheral blood) obtained at diagnosis were acquired from Australian biobanks affiliated with the Children's Hospital at Westmead (CHW) and the Royal Children's Hospital in Melbourne (RCH). CHW patients were treated on a Berlin-Frankfurt-Münster Study 8/9 protocol with a mix of all risk groups. RCH patients were predominantly treated on a Children's Oncology Group (AALL 0932/AALL 1131) or St. Jude's Total XVII protocol, also inclusive of all risk groups. Only patients with B-ALL and for whom diagnostic samples were available were included.
Samples were analyzed via data-independent acquisition (DIA) mass spectrometry (MS) through the Proteome of Cancer (ProCan®) program in Sydney. After proteomic quality control and filtering on non-missing clinical data, the final study group consisted of 117 patients (CHW, n = 63; RCH, n = 54). A published cohort from British Columbia Children's Hospital with publicly available DIA MS proteomic data, consisting of 25 ALL samples after the above filtering, was also accessed. All 3 cohorts underwent normalization and correction for batch effects individually and were then harmonized for ensemble machine learning.
The ensemble model incorporated Cox proportional hazards (CoxPH), random survival forest, and linear multi-task logistic regression base models, combined with forward stepwise feature selection, following least absolute shrinkage and selection operator (LASSO) regularization.
Results: A 13-protein biomarker signature was derived using feature selection. Three proteins were significantly upregulated in the R group, and 10 were downregulated. A risk score was derived from the hazard ratios of these proteins to dichotomize patients into high- or low-risk groups for relapse. The ensemble model was refined by weighting the performance of each base model to improve overall predictive accuracy.
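As a rough illustration of the two steps described above, the sketch below computes a risk score as a sum of log hazard ratios weighted by protein abundance, splits patients at a cut-off, and combines base-model predictions with performance-proportional weights. The abstract does not state the exact scoring formula, the cut-off, or the weighting scheme, so the linear form, the median cut-off, the protein names, and all numeric values here are assumptions for illustration only.

```python
import math
from statistics import median

def risk_score(abundances, hazard_ratios):
    """Sum of log(HR) * abundance over the signature proteins (assumed form)."""
    return sum(math.log(hazard_ratios[p]) * abundances[p] for p in hazard_ratios)

def dichotomize(scores, cutoff=None):
    """Label each patient 'high' or 'low'; the median cut-off is an assumption."""
    if cutoff is None:
        cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

def weighted_ensemble(predictions, performance):
    """Average base-model predictions, weighting each by its performance."""
    total = sum(performance.values())
    return sum(predictions[m] * performance[m] / total for m in predictions)

# Hypothetical hazard ratios: HR > 1 for a protein upregulated in relapse,
# HR < 1 for a protein downregulated in relapse.
hazard_ratios = {"P1": 2.0, "P2": 0.5}

# Hypothetical standardized abundances for three patients.
patients = {
    "pt1": {"P1": 1.0, "P2": -1.0},
    "pt2": {"P1": -1.0, "P2": 1.0},
    "pt3": {"P1": 0.0, "P2": 0.0},
}

scores = {pid: risk_score(x, hazard_ratios) for pid, x in patients.items()}
groups = dichotomize(scores)

# Hypothetical base-model outputs and performance values for one patient.
combined = weighted_ensemble({"a": 1.0, "b": 0.0}, {"a": 3.0, "b": 1.0})
```

Taking the logarithm of each hazard ratio gives upregulated proteins a positive weight and downregulated proteins a negative weight, so a higher score corresponds to a higher assumed relapse risk.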
Model validation used a 3-iteration approach: in each round, 1 ALL cohort was designated as the training dataset (with an internal 50/50 train-test split stratified by gender and age groups), and the remaining 2 cohorts served as external validation datasets. This strategy yielded performance accuracies of 88–100%, demonstrating strong generalizability of the ensemble model across cohorts.
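The rotation described above can be sketched as follows. This is a minimal illustration, not the study's pipeline: the toy cohorts, the field names (`gender`, `age_group`), the cohort label `BC`, and the rounding rule for odd-sized strata are all assumptions, and model fitting and scoring are left out.

```python
import random
from collections import defaultdict

def stratified_half_split(patients, strata_keys=("gender", "age_group"), seed=0):
    """Split patients roughly 50/50 within each stratum (assumed procedure)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for p in patients:
        strata[tuple(p[k] for k in strata_keys)].append(p)
    train, test = [], []
    for members in strata.values():
        rng.shuffle(members)
        half = (len(members) + 1) // 2  # odd strata give the extra patient to train
        train.extend(members[:half])
        test.extend(members[half:])
    return train, test

def rotation_rounds(cohorts):
    """Yield one round per cohort: internal split plus the other cohorts as external sets."""
    for name in cohorts:
        train, test = stratified_half_split(cohorts[name])
        external = {k: v for k, v in cohorts.items() if k != name}
        yield name, train, test, external

def toy_cohort(prefix, n):
    """Build a hypothetical cohort with alternating gender and age group."""
    return [
        {
            "id": f"{prefix}{i}",
            "gender": "F" if i % 2 else "M",
            "age_group": "<10" if i % 4 < 2 else ">=10",
        }
        for i in range(n)
    ]

# Toy cohort sizes, not the real ones.
cohorts = {"CHW": toy_cohort("c", 8), "RCH": toy_cohort("r", 8), "BC": toy_cohort("b", 4)}
rounds = list(rotation_rounds(cohorts))
```

Each round would train on `train`, tune or check on `test`, and report performance separately on each cohort in `external`, so that no external cohort contributes to model fitting in that round.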
The 13-protein signature also effectively distinguished R and NR samples using linear discriminant analysis, with statistically significant separation (p < 0.05). Univariate analysis of clinical variables, including gender, age, white cell count, cytogenetics (CG), and measurable residual disease (MRD) at the end of induction, showed that these parameters were not consistently predictive of relapse in both RCH and CHW cohorts. In contrast, the protein signature remained consistently predictive. Multivariate CoxPH modeling, including MRD and CG with the protein signature, demonstrated a concordance index (CI) of 86% in CHW and 91% in the RCH cohort. The CI of the protein signature alone was 84%, demonstrating the improved predictive accuracy of the proteomic signature when integrated with clinical variables.
Conclusions and Future Directions: We have developed a 13-protein signature capable of distinguishing between relapse and non-relapse patients using diagnostic samples. We acknowledge, however, that prognostic factors (for relapse risk prediction) are treatment dependent. Given the recent integration of immunotherapy into frontline B-ALL protocols, this signature may require revision for future cohorts. We plan to prospectively test this signature, alongside clinical variables such as MRD and CG, in patients receiving current treatments, to assess its robustness in predicting relapse as therapeutic strategies evolve.